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ABSTRACT 

Using the differential precision methods developed previ¬ 
ously by the same authors, we study the p-adic stability 
of standard operations on matrices and vector spaces. We 
demonstrate that lattice-based methods surpass naive meth¬ 
ods in many applications, such as matrix multiplication and 
sums and intersections of subspaces. We also analyze de¬ 
terminants, characteristic polynomials and LU factorization 
using these differential methods. We supplement our obser¬ 
vations with numerical experiments. 

Categories and Subject Descriptors 

1.1.2 [Computing Methodologies]: Symbolic and Alge¬ 
braic Manipulation - Algebraic Algorithms 

General Terms 

Algorithms, Theory 

Keywords 

p-adic precision; linear algebra; ultrametric analysis 

1. INTRODUCTION 

For about twenty years, the use of p-adic methods in sym¬ 
bolic computation has been gaining popularity. Such meth¬ 
ods were used to compute composed products of polynomi¬ 
als [2], to produce hyperelliptic curves of genus 2 with com¬ 
plex multiplication [4], to compute isogenies between elliptic 
curves [8] and to count points on varieties using p-adic co¬ 
homology theories (c/. [5,7] and many followers). However, 
a general framework allowing a precise study of p-adic pre¬ 
cision — the main issue encountered when computing with 
p-adic numbers — was designed only recently in [3]. 

In [3], we advocate the use of lattices to track the preci¬ 
sion of vectors and points on p-adic manifolds. The main 
result of loc. cit., recalled in Proposition 2.1, allows for the 
propagation of precision using differentials. While represent¬ 
ing precision by lattices carries a high cost in space and time 
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requirements, the reduced precision loss can sometimes over¬ 
whelm these costs within a larger algorithm. In this paper, 
we apply the ideas of [3] to certain linear algebraic tasks. 

Main Results. We give a number of contexts where lattice- 
based precision methods outperform the standard coordinate- 
wise methods. The two most striking examples are an anal¬ 
ysis of matrix multiplication and of sums and intersections 
of subspaces. In Proposition 3.1, we give a formula for the 
lattice-precision of the product of two matrices. In Figure 
2, we describe the precision loss in multiplying n matrices, 
which appears linear in n when using standard methods and 
logarithmic in n when using lattice methods. We also give 
an example in Figure 4 where lattice methods actually yield 
an increase in precision as the computation proceeds. 

Organization of the paper. Section 2.1 recalls the the¬ 
ory of precision developed in [3] and defines a notion of dif¬ 
fuse precision for comparing to coordinate-wise methods. In 
particular, we recall Proposition 2.1, which allows the com¬ 
putation of precision using differentials. Proposition 2.5 in 
Section 2.2 is a technical result that will be used to deter¬ 
mine the applicability of Proposition 2.1. 

In Section 3.1, we analyze matrix multiplication and re¬ 
port on experiments that demonstrate the utility of lattice- 
based precision tracking. Section 3.2 finds the precision of 
the determinant of a matrix, and Section 3.3 applies the re¬ 
sulting formula to characteristic polynomials. We define the 
precision polygon of a matrix, which gives a lower bound on 
the precision of the characteristic polynomial. This poly¬ 
gon lies above the Hodge polygon of the matrix; we give 
statistics on the difference and the amount of diffuse preci¬ 
sion. Finally, we apply Proposition 2.1 to LU factorization 
and describe experiments with lattice-based precision anal¬ 
ysis. Section 4.1 reviews the geometry of Grassmannians, 
which we apply in Section 4.2 to differentiating the direct 
image, inverse image, sum and intersection maps between 
Grassmannians. We then report in Section 4.3 on track¬ 
ing the precision of subspace arithmetic in practice. In the 
appendix, we give a proof of Proposition 2.4. 

The code used to make experiments presented in this pa¬ 
per is available at https://github.com/CETHop/padicprec. 

Notation. Throughout the paper, K will refer to a com¬ 
plete discrete valuation field. Usual examples are finite ex¬ 
tensions of Qp and Laurent series fields over a field. We 
denote by val : if ^ Z U {-boo} the valuation on K, by Ok 
the ring of integers and by tt € if an element of valuation 
1. We let II • II be the norm associated to val. 


2. THE THEORY OF P-ADIC PRECISION 

The aim of this section is to briefly summarize the content 
of [3] and All in certain details. 

2.1 Lattices as precision data 

In [3], we suggest the use of lattices to represent the pre¬ 
cision of elements in IT-vector spaces. We shall contrast 
this approach with the coordinate-wise method used in Sage, 
where the precision of an element is specified by giving the 
precision of each coordinate separately and is updated after 
each basic operation. 

Consider a finite dimensional^ normed vector space E de¬ 
fined over K. We use the notation || • ||_b for the norm on E 
and (resp. B^(r)) for the open (resp. closed) ball of 

radius r centered at the origin. A lattice L C if is a sub-Orc- 
module which generates E over K. Since we are working in a 
ultrametric world, the balls B^{r) and B^{r) are examples 
of lattices. Actually, lattices should be thought of as spe¬ 
cial neighborhoods of 0 and therefore are good candidates 
to model precision data. Moreover, as revealed in [3], they 
behave quite well under (strictly) differentiable maps: 

Proposition 2.1. Let E and E be two finite dimensional 
normed vector spaces over K and f : U ^ F be a function 
defined on an open subset U of E. We assume that f is 
differentiable at some point vo £ U and that the differential 
f'{vo) is surjective. Then, for all p £ (0,1], there exists a 
positive real number S such that, for all r G (0, 5), any lattice 
H such that Bf {pr) C H G B^ (r) satisfies: 

f{vo + H)^ f{vo) + f'{vo){H). (1) 

This proposition enables the lattiee method of tracking 
precision, where the precision of the input is specified as a 
lattice H and precision is tracked via differentials of the steps 
within a given algorithm. The equality sign in Eq. (1) shows 
that this method yields the optimum possible precision. We 
refer to [3, §4.1] for a more complete exposition. 

In [3], we also explained that if / is locally analytic, then 
the constant 5 appearing in Proposition 2.1 can be expressed 
in terms of the growing function A(/) defined by 

A(/)(n) = log(sup^g^-(^„j ||/(h)||) 

with the convention that A(/)(n) = -|-oo if / does not con¬ 
verge on Bfie"). We refer to [3, Proposition 3.12] for the 
precise statement. We state here the case of integral polyno¬ 
mial functions. A function f : E ^ F is said to be integral 
polynomial if it is given by multivariate polynomial functions 
with coefficients in Ok in any (equivalently all) system of 
coordinates associated to a Ox-basis of Be{1). 

Proposition 2.2. We keep the notation of Proposition 2.1 
and assume in addition that f is integral polynomial. Let C 
be a positive real number such that Bf{1) C f' {vo){Be{C)). 
Then Proposition 2.1 holds with 5 = C ■ p~^. 

In [3, Appendix A], the theory is extended to manifolds 
over K, where the precision datum at some point a: is a 
lattice in the tangent space at x. Propositions 2.1 and 2.2 
have analogues obtained by working in charts. We use this 
extension to compute with vector spaces in §4. 

^The framework of [3] is actually those of Banach spaces. 
However, we will not need infinite dimensional spaces here. 


We will use the following definition in contrasting lattice 
and coordinate-wise methods. Suppose E is equipped with a 
basis (ei,. .., Cn) and write TTi : E —>■ Kei for the projections. 

Definition 2.3. Let H G E he a. lattice. The number of 
diffused digits of precision of H is the length of Ho/H where 

Ho = 7ri(H)©---©7r„(H). 

If H represents the actual precision of some object, then 
Ho is the smallest diagonal lattice containing H. Since 
coordinate-wise methods cannot yield a precision better than 
Ho, k provides a lower bound on the number of p-adic digits 
gained by lattice methods over standard methods. 

2.2 A bound on a growing function 

In the next sections, we will compute the derivative of 
several standard operations and sometimes give a simple 
expression in term of the input and the output. In other 
words, the function / modeling such an operation satisfies 
a differential equation of the form f'=go (/, id) where g 
is a given — and hopefully rather simple — function. The 
aim of this subsection is to study this differential equation 
and to derive from it certain bounds on the growing function 
A(/). We will assume that K has characteristic 0. 

Let E, F and G be finite-dimensional normed vector spaces 
with U G E and V C F and W G G open subsets. General¬ 
izing the setting above, we consider the differential equation: 

f'=g°{f,h). (2) 

Here g : V x W ^ Hom(i5, F) and h : U ^ W are known 
locally analytic functions and f : U ^ V is the unknown 
locally analytic function. In what follows, we always assume 
that V and W contain the origin, /(O) = 0, h{0) — 0 and 
g{0) 7 ^ 0. These assumptions are harmless for two reasons: 
first, we can always shift / and h (and g accordingly) so 
that they both vanish at 0, and second, in order to apply 
Proposition 2.2 the derivative f'{0) needs to be surjective 
and therefore a fortiori nonzero. 

We assume that we are given in addition two nondecreas¬ 
ing convex functions Ag and Ah such that A{g) < Ag and 
A{h) < Ah. We suppose further that there exists v such 
that Ag is constant on the interval (—oo,iz]^. We introduce 
the functions and Af defined by: 

X a X < n, 

+0O otherwise; 

and Af{x) = Tv o (id + Ag o Ah){x + a), 

where a is a real number satisfying ]]n!|] > e““" for all n. 
If p is the characteristic of the residue field, a suitable value 
for a is a = ■ log ]]p]] if p > 0 and a = 0 if p = 0. The 

next Proposition is proved in Appendix A. 

Proposition 2.4. We have A(/) < Af. 

Figure 1 illustrates Proposition 2.4. The blue plain line 
represents the graph of the function A/. A quick computa¬ 
tion shows that, on a neighborhood of —oo, this function is 
given by A/(x) = a; + a + p where p is the value that Ag 
takes on the interval {—oo,v\. Proposition 2.4 says that 
the graph of A(/) lies below the plain blue line. More¬ 
over, we remark that the Taylor expansion of f{z) starts 

^We note that this assumption is fulfilled if we take Ag = 
A{g) because we have assumed that p(0) does not vanish. 





Figure 1: Admissible region for the graph of A(/) 

with the term < 7 ( 0 ) 2 :. Hence, on a neighborhood of — 00 , we 
have A{f){x) = a; + log ||( 7 ( 0 )||. Using convexity, we get: 
A{f){x) > X + log ||<?(0)|| for all a: € R. In other words, the 
graph of A(/) lies above the brown line. Furthermore, we 
know that the slopes of A(/) are all integral because / is 
locally analytic. Hence, A(/) cannot lie above the dashed 
blue line defined as the line of slope 2 passing through the 
first break point of the blue plain line, which has coordinates 
{yo — a — fi, yo) with yo = min(A)))^(i/) + fi, v). As a conclu¬ 
sion, we have proved that the graph of A(/) must coincide 
with the brown line until it meets the dashed blue line and 
then has to stay in the green area. 

As a consequence, we derive the following proposition, 
which can be combined with Proposition 3.12 of [3]. Follow¬ 
ing [3], if i/p is a convex function and u € R, we define 

tp>v : 2 ; 1 -^ inf {‘p{x + y) — vy). 

y>0 ' 

It is the highest convex function with (p>„ < and > v. 

Proposition 2.5. Keeping the above notation, we have: 

A{f)> 2 {x) < 2{x -I- a -I- 7 r) - mm{A'^^{v) -|- p, u) 

for all X < min(A)))^(i/) — a, 1/ — 71 — a). 

Proof. The inequality follows from the fact that y — 
2[xa-\- pl) — yt, is the equation of the dashed blue line. □ 

Remark 2.6. In certain situations, it may happen that the 
function / is solution of a simpler differential equation of 
the form If this holds, Proposition 2.5 gives the 

bound A(/)> 2 (®) < 2 (a; + a + y) — v for x < v — y — a. 

Beyond this particular case, we recommend choosing the 
function h by endowing F with the second norm || 2 :||i:’ = 
e^ ■ ||a;||F {x G F) and taking h : {F, || • ||f) {F, || • H)?) 

to be the identity on the underlying vector spaces. The 
function A{h) : R ^ R then maps x to x + y and we can 
choose Ah ~ A{h). 

3. MATRICES 

Let Mm,n{K) denote the space of m x n matrices over 
K. We will repeatedly use the Smith decomposition for 


M G Mm,n{K), which is M — Um ■ Am ■ Vm with Um 
and Vm unimodular and Am diagonal. Write ai{M) for the 
valuation of the (i, i)-th entry of Am, and by convention set 
ai{M) — - 1-00 if i > min(m,n). Order the ai{M) so that 
Oi{M) < cri+i(M). 

3.1 Multiplication 

To begin with, we want to study the behavior of the preci¬ 
sion when performing a matrix multiplication. Let r, s and 
t be three positive integers and assume that we want to mul¬ 
tiply a matrix A G Mr,s{K) by a matrix B G Ma,t{K). This 
operation is of course modeled by the integral polynomial 
function: 

Vr,s,t ■■ Mr,s{K) X Ms,t{K) ^ Mr,t{K) 

{A,B) i-> AB. 

According to Proposition 2.1, the behavior of the precision 
when computing AB is governed by P'„ t(A,i3), the linear 
mapping that takes a pair {dA, dB) to A ■ dB -\- dA ■ B. 

To £x ideas, let us assume from now that the entries of 
A and B all lie in Ok and are known at the same precision 
O(n^). In order to apply Propositions 2.1 and 2.2, we then 
need to compute the image of the standard lattice £0 = 
Mr,s(OK) X Ms,t(OK) under P',, i(A, B). It is of course 
contained in Mr,t(OK); this reflects the obvious fact that 
each entry of the product AB is also known with precision 
at least Nonetheless, it may happen that the above 

inclusion is strict, meaning that we are gaining precision in 
those cases. 

Set Ui = <Ti(j4) and F = <Ji{B), and define Mr,t{{ai), (bj)) 
as the sublattice of Mr,{(die) consisting of matrices M = 
(Mij) such that val(Mij) > mm{ai,bj) for all {i,j). 

Proposition 3.1. With the above notation, we have 

V'r,.,AA,B){£o) = Ua ■ Mr,ti{ai), ( 67 )) • Vb 

and length(p^;i^^i^^^) =^min(ai,b 7 ) 

Proof. We write A • dB + dA • B = Ua • M • Vb with 

M ^ Aa-Va- dA-Vg^ + -dB-UB- Ag. 

When dA varies in Ma,b{OK) so does Va ■ dA ■ Vg^ and 
therefore the first summand in M varies in the subspace 
of MryiOx) consisting of matrices whose entries on the i- 
th row have valuation at least ai. Arguing similarly for 
the second summand, we deduce the first statement of the 
Proposition. The second statement is now clear. □ 

From the perspective of precision, the second statement 
of Proposition 3.1 means that the computation of AB gains 
5])^ .min[ai,hj) significant digits in absolute precision® as 
soon as N > min(a7.,bt) (c/. Proposition 2.2). However, 
many of these digits are diffused in the sense of Definition 
2.3. To change bases in order to make this increased pre¬ 
cision visible with coordinates, write AB = Ua ■ P ■ Vb 
with P = A A ■ Va • Ub ■ Ab- Tracking precision in the 
usual way, the (i,j)-th entry of P is known at precision 
Multiplication by Ua and Vb then dif¬ 
fuses the precision across the entries of AB. 

®We note that, on the other side, the valuation of the entries 
of AB may increase, meaning that we are also losing some 
significant digits if we are reasoning in relative precision. 
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Figure 2: Average loss of precision in Algorithm 1 

We now consider the impact of tracking this diffuse pre¬ 
cision. Although the benefit is not substantial for a single 
product of matrices, it accumulates as we multiply a large 
number of matrices. We illustrate this phenomenon with the 
following simple example. 


Algorithm 1: exainple_product 

Input: a list (Mi,..., M„) of square matrices of size d. 

1. Set P to the identity matrix of size d 

2. for j = 1,... ,n do compute P = PMi 

3. return the top left entry of P 


Figure 2 compares the number of signihcant digits in relative 
precision we are losing on the output of Algorithm 1 when 
we are using, on the one hand, a standard coordinate-wise 
track of precision and, on the other hand, a lattice-based 
method to handle precision. We observe that, in the first 
case, the number of lost digits seems to grow linearly with 
respect to the number of multiplications we are performing 
(that is n) whereas, in the second case, the growth seems 
to be only logarithmic. It would be nice to have a precise 
formulation (and proof) of this heuristic. 

Note that multiplication of many random matrices plays a 
central role in the theory of random walks on homogeneous 
spaces [1]. Better stability in computing such products helps 
estimate Lyapunov exponents in that context. 

3.2 Determinant 

The computation of the differential of det : M„,„ {K) K 
is classical: at a point M it is the linear map 

det'(M) : dM ^ Tr(Com(M) • dM), 

where Com(M) stands for the comatrix of M, which is 
det(M)M“^ when M is invertible. If rank(M) < n — 1, 
then det'(M) is not surjective and we cannot apply Propo¬ 
sition 2.1. Therefore, we suppose that rank(M) > n — 1 for 
the rest of this section. 

As with matrix multiplication, we first determine the im¬ 
age of the standard lattice Co = Mn.niOK) under det'(M). 

Proposition 3.2. Setting v = ai{M) -I- • • • -I- (j„_i(M), we 
have det'(M)(£o) = tv^Ok. 

Proof. From the description of det'(M), we see that it 
is enough to prove that the smallest valuation of an entry of 
Com(M) is V or, equivalently, that the ideal of Ok generated 
by all minors of M of size (n — 1) is But this ideal 

remains unchanged when we multiply M on the left or on 
the right by a unimodular matrix. Thus we may assume 
that M = Am, and the result becomes clear. □ 


In terms of precision, Proposition 3.2 implies that if M is 
given at flat precision 0{n^) with N > v, then det(M) is 
known at precision Thus we are gaining v digits 

in absolute precision or, equivalently, losing an{M) digits of 
relative precision. Furthermore, one may compute det(M) 
with this optimal precision by finding an approximate Smith 
decomposition with Am known at precision 0(7r^) and mul¬ 
tiplying its diagonal entries. 

3.3 Characteristic polynomials 

Write char : Mn.niK) —>■ K[X] for the characteristic poly¬ 
nomial, and K<in[X] C for the subspace consisting of 

polynomials of degree less than n. Then the differential of 
char at a point M is 

char'(M) : dM ^ Tr(Com(X - M) • dM). 

The image is the A-span of the entries of Com(X—M), 
which is clearly contained within A<„ [A]. In fact, the image 
will equal A<„[A] as long as M does not have two Jordan 
blocks with the same generalized eigenvalue. For now on, 
we make this assumption. 

Recall that the Newton polygon NP(/) of a polynomial 
f{X) = Yli 0‘iX'' is the lower convex hull of the points 
(i, val(ai)) and the Newton polygon NP(M) of a matrix M is 
NP(char(M)). The Hodge polygon HP(M) of M is the lower 
convex hull of the points (*, '^j(-^))- For any matrix 

M, the polygon NP(M) lies above HP(M) [6, Thm. 4.3.11]. 

Such polygons arise naturally in tracking the precision 
of polynomials [3, §4.2]. Any such polygon P yields a lat¬ 
tice Cp in A<n[A] consisting of polynomials whose Newton 
polygons lie above P. This lattice is generated by monomi¬ 
als fliA*, where val(ai) is the ceiling of the height of P at 
i. These polygons are used in a coordinate-wise precision 
tracking for polynomial arithmetic. We now introduce an¬ 
other polygon, bounded between NP(M) and HP(M), that 
will provide an estimate on the precision of char(M). 

Definition 3.3. The precision polygon PP(M) of M is the 
lower convex hull of the Newton polygon of the entries of 
Com(A-M). 

It is clear from the definition -Cpp(M) C char'(M)(£o) 
where Co is the standard lattice Mn.niOK). More precisely, 
PP(M) is the smallest polygon P for which the inclusion 
Cp C char'(M)(/io) holds. By Proposition 2.1, the preci¬ 
sion polygon determines the minimal precision losses possi¬ 
ble when encoding polynomial precision using polygons. 

It turns out that the precision polygon is related to the 
Hodge and Newton polygons. If a polygon P has vertices 
{xi,yi), we let Tn{P) be the translated polygon with vertices 
{xi - n,yi). 

Proposition 3.4. The precision polygon PP(M) lies be¬ 
tween ri(HP(M)) and NP(M). 

Moreover, PP(M) and ri(HP(M)) meet at 0 and n—1. 

Proof. The coefficients of char(M) can be expressed as 
traces of exterior powers: the coefficient of A' is Tr(A*(M)), 
which is Ti{A‘{Um)M{Am)N‘{Vm)). Computing A'(Am), 
we get the first statement of the Proposition. For i = 1, we 
further Hnd that PP(M) vanishes at the abscissa n—1. By 
dehnition so does ri(HP(M)). The fact that PP(M) and 
ri(HP(M)) meet at abscissa 0 follows from Proposition 3.2. 

It remains to prove the comparison with the Newton poly¬ 
gon. Set / = char(M), set mij as the (i,j)-th entry of 











M, fij as the (i,j)-th entry of Com(X —M) and fiij = 
val(mi,j). We write f[k] for valuation of the coefficient of 
X* in /, and set /[—I] = +oo. The equation (X—M) ■ 
Com{X—M) = f ■ I yields, for all i and k, 

y [^] ^ l^f 1] 5 “t“ fo,i [^] 7 • ■ ■ 7 f n,i [^]) 

> inf(/i,i[fc-l],/yi[fc]), 

with the infimum over j. Taking lower convex hulls and 
noting that PP(M) is nonincreasing, which follows from the 
comparison with the Hodge polygon, we get the result. □ 

Remark 3.5. Experiments actually support the following 
stronger result: PP(M) is bounded above by Ti(NP(M)). 

For many matrices, PP(M) = Ti(HP(M)). For random 
matrices over Z2, the 2-adic precision polygon is equal to 
the Hodge polygon in 99.5% of cases in dimension 4, down 
to 99.1% in dimension 8. Over Z3, the fraction rises to 
99.98%, with no clear dependence on dimension. Empiri¬ 
cally, PP(M) seems most likely to differ from ri(HP(M)) 
at 1, corresponding to the precision of the linear term of the 
characteristic polynomial. 

Of course, the precision lattice £ = char'(M)(£o) may 
contain diffuse precision which is not encapsulated in PP(M). 
Diffuse precision arises in 11% of cases in dimension 3, up to 
15% of cases in dimension 8. This percentage increases as 
val(det(M)) increases, reaching 34% in dimension 9 for ma¬ 
trices constrained to have determinant with 2-adic valuation 
12. Moreover, one can construct examples with arbitrarily 
large amounts of diffuse precision. Suppose the ai{M) are 
large. Proposition 3.4 implies that £ is contained within 
OK,<n[X] with index at least The precision 

lattice of 1 -I- M is obtained from £ via the transformation 
X 1 -^ 1 + X, but PP(1 -I- M) is now flat with height 0. 

For randomly chosen matrices, approximating £ using the 
Hodge polygon loses only a small amount of precision. How¬ 
ever, if the o-i{Mys are large or if M is a translate of such 
a matrix, using lattice precision can be very useful. 

3.4 LU factorization 

In this section, we denote by || • || the subordinate matrix 
norm on M„(K) and, given a positive real number C, we let 
B(C) be the closed ball in Mn{K) centered at the origin of 
radius C. We consider the following subsets of Mn{K): 

• On is the open subset of matrices whose principal minors 
do not vanish (we recall that the latest condition implies the 
existence and the uniqueness of a LU factorization); 

• U„ is the sub-vector space of upper-triangular matrices; 

• Ln (resp. L“) is the sub-afEne space of nilpotent (resp. 
unipotent) lower-triangular matrices. 

Calculus and precision. We choose to normalize the LU 
factorization by requiring that L is unipotent and denote 
by D : On Ln Un the function modeling this decom¬ 
position. The computation of the differential of D has al¬ 
ready been done in [3, Appendix B]. For M € On with 
D{M) = (L, U), the linear mapping D'{M) is given by: 

dM ^ {L- low{dX), up{dX) ■ U) with dX = L~^ ■ dM ■ U~^ 

where low (resp. up) denotes the canonical projection of 
Mn{K) onto Ln (resp. Un)- It is easily checked that D'{M) 
is bijective with inverse given by {A,B) 1—>■ ALf -\- LB. 

We now want to apply Proposition 2.5 in order to derive 
a concrete result on precision. We then assume that K has 
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Figure 3: Loss of precision for LU factorization 

characteristic 0. We pick Mq € On and write D{Mq) = 
{Lo,Uo). We consider the translated function / taking M 
to D{Mo + M) — D{Mo). We then have /(O) = 0 and 
f'(M) = D'{Mq + M). Using the explicit description of 
the inverse of D'{Mo + M), we find f?(l) C /^(O) ■ B{C) for 
C = max(||?7o||, ||Lo||)- Moreover, / satisfies the differential 
equation f'~gof where g is defined by: 

g{A, B){X) = ((Lo + A) ■ low(y), up(y) • {Uo + B)) 
with Y ^ {Lo + A)-^ ■ X ■ {Uo + B)-K 

Let k{S) = IISII • ||5“^|| denote the condition number of a 
matrix S. Remarking that ||5-|-T|| = US’!! if ||r|| < US’!! and 
||(S-I-T)“^|| = ||>S'“^|| if ||T|| < we deduce from (3) 

that ||g(A,B)|| < max (k(Lo)||Uo'"^||7 k(I^o)||Lo ^11 ) provided 
that ||A|| < ||L(7^||~^ and ||i3|| < Combining this 

with Proposition 2.5 and [3, Proposition. 3.12], we finally 
find that Eq. (1) holds as soon as 

^ > Ibll"'* • max(||Lo||, ||Uo||) • max(||Lo ^||, ||Uo"^||) • 

max {K{Lo)\\Uzy\, KiUo)\\Lyy\)^. 

Numerical experiments. Let Bn = {Eij)i<ij<n be the 
canonical basis of Mn{K). It can be naturally seen as a basis 
of Ln X Un as well. For a given M € On, let us abuse notation 
and write D'{M) for the matrix of this linear mapping in the 
above basis. We remark that D'{M) is lower-triangular in 
this basis. Projecting D'[M) onto each coordinate, we find 
the best coordinate-wise loss of precision we can hope for the 
computation of D is given by Ylu (yal{D'{M)n,v)))■ 

This number should be compared to val(det(0'(M))), which 
is precision lost in the lattice method. The number of dif¬ 
fused digits of precision of D'{M){Mn,n{OK)) is then the 
difference between these two numbers. Figure 3 summarizes 
the mean and standard deviation of those losses for a sample 
of 2000 random matrices in Md,d{'^ 2 ). 

4. VECTOR SPACES 

Vector spaces are generally represented as subspaces of 
is'" for some n and hence naturally appear as points on 
Grassmannians. Therefore, one can use the framework of 
[3, Appendix A] to study p-adic precision in this context. 

4.1 Geometry of Grassmannians 

Given E, a finite dimensional vector space over K, and 
d, an integer in the range [0,dimj5], we write Grass(i5, d) 
for the Grassmannian of d-dimensional subspaces oi E. It is 
well-known that Grass(i?,d) has the natural structure of a 
iL-manifold. The aim of this subsection is to recall standard 
facts about its geometry. In what follows, we set n = dim E 
and equip E with a distinguished basis (ei,..., Cn). 










Description and tangent space. Let V denote a fixed 
subspace of E of dimension d. The Grassmannian Grass(-B, d) 
can be viewed as the quotient of the set of linear embed¬ 
dings f V ^ E modulo the action (by precomposition) of 
GL(i/): the mapping / represents its image f{V). It follows 
from this description that the tangent space of Grass(i5,d) 
is canonically isomorphic to Hom(V,-B)/End(y), which is 
nom{V,E/V). 

Charts. Let V and be two complementary subspaces 
of E {i.e. V (B V‘^ = E). We assume that V has dimen¬ 
sion d and denote by tt the projection E ^ V corresponding 
to the above decomposition. We introduce the set Uv,v<^ 
of all embeddings f V ^ E such that tt o / = idv- 
Glearly, it is an affine space over Hom(I/, Further¬ 

more, we can embed it into Grass(B, d) by taking / as 
above to its image. This way, Uv,v<^ appears as an open 
subset of Grass (B,d) consisting exactly of those subspaces 
W such that W DV^ = 0. As a consequence, the tangent 
space at each such W becomes isomorphic to Hom(V,y'^). 
The identification Hom(y, W) —>■ Hom(IT, B/IT) is given 
by du {du o f~^) mod W where f ■. V ^ W is the linear 
mapping defining W. 

When the pair {V, V^) varies, the open subsets Uv,v^ cover 
the whole Grassmannian and define an atlas. When imple¬ 
menting vector spaces on a computer, we usually restrict 
ourselves to the subatlas consisting of all charts of the form 
{Vi , Vic ) where / runs over the family of subsets of {1,..., n} 
of cardinality d and Vi is the subspace spanned by the a’s 
with i £ 1. A subspace W £ E then belongs to at least one 
ldvj,Vjc and, given a family of generators of IT, we can deter¬ 
mine such an I together with the corresponding embedding 
f : Vi ^ E hj row reducing the matrix of generators of W. 

A variant. Alternatively, one can describe Grass(B, d) as 
the set of linear surjective morphisms f ■. E ^ E/V modulo 
the action (by postcomposition) of Gh{E/V). This iden¬ 
tification presents the tangent space at a given point V 
as the quotient Hom(B, B/T)/End(B/y) ~ Hom(y,B/T). 
Given a decomposition B = T © as above, we let Uy yc 
denote the set of surjective linear maps f ■. E ^ V^ whose 
restriction to is the identity. It is an affine space over 
Hom(V) V^) which can be identified with an open subset of 
Grass(B, d) via the map / i-^ ker /. 

It is easily seen that Uyyc and Uy yc define the same 
open subset in Grass(B,d). Indeed, given / £ Uyyc^ one 
can write / = idv + h with h £ Hom(V, V'^) and define the 
morphism g = ids — ho n £ Uyyc. The association f g 
then defines a bijection Uyyc Uyyc which commutes 
with the embeddings into the Grassmannian. 

Duality. If B is a finite dimensional vector space over K, 
we use the notation B* for its dual {i.e. B* = Hom(B,B')). 
If we are also given a subspace V C E, we denote by V^ 
the subspace of B* consisting of linear maps that vanish 
on V. We recall that the dual of V^ (resp. E*/V^) is 
canonically isomorphic to E/V (resp. V). For all d, the 
association V i—^ V^ defines a continuous morphism 'i/e ■ 
Grass(B,d) —>■ Grass(B*,n — d). The action of ipE on tan¬ 
gent spaces is easily described. Indeed, the differential of 
at T is nothing but the canonical identification between 
Hom(V', B/T) and Hom(T^, B*/I/^) induced by transposi¬ 
tion. Furthermore, we observe that i/e respects the charts 
we have defined above, in the sense that it maps bijectively 
Uyyc to UyE ye'll. — UyE ycy. 


4.2 Differential computations 

In this subsection, we compute the differential of various 
operations on vector spaces. For brevity, we skip the esti¬ 
mation of the corresponding growing functions (but this can 
be done using Proposition 2.5 as before if char(B') = 0). 

Direct images. Let B and B be two finite dimensional 
B-vector spaces of dimension n and m, respectively. Let d 
be an integer in [0, n]. We are interested in the direct image 
function DI defined on M — Hom(B, B) x Grass(B, d) that 
takes the pair {f,V) to f{V). Since the dimension of f{V) 
may vary, the map DI does not take its values in a well- 
defined Grassmannian. We therefore stratify Ai as follows: 
for each integer r £ [0, d], let Mr £ M he the subset of pairs 
(/, V) for which f{V) has dimension r. The MVe are locally 
closed in M and are therefore submanifolds. Moreover, DI 
induces differentiable functions DD : Mr —>■ Grass(B, r). 

We would like to differentiate DE around some point 
(/, V) £ Mr. To do so, we use the first description of the 
Grassmannians we gave above: we see points in Grass(B, d) 
(resp. Grass(B, d)) as embeddings V ^ E (resp. W ^ E) 
modulo the action of GL(T) (resp. GL(IF)). The point 
V £ Grass(B, d) is then represented by the canonical inclu¬ 
sion V : V —i E whereas a representative w of W satisfies 
w o ip = f o V where p ■. V ^ W is the linear mapping 
induced by /. The previous relation still holds if (/, v) is 
replaced by a pair (/',«') £ Mr sufficiently close to {f,v). 
Differentiating it and passing to the quotient we find, first, 
that the tangent space of Mr at (/, v) consists of pairs 
{df,dv) £ Hom(B,B) x Kom{V, E/V) such that 

dw = {{df o V + f o dv) mod IT) £ Hom(T, E/W) 

factors through p {i.e. vanishes on kerip = T n ker/) and, 
second, that the differential of DE at (/, V) is the linear 
mapping sending {df, dv) as above to the unique element 
dw £ Hom(IT, B/IT) such that dw o p = dw. 

Inverse images. We now consider the inverse image map¬ 
ping II sending a pair (/, IT) £W = Horn(B, B)x Grass(B, d) 
to f~^{W). As before, this map does not take values in 
a single Grassmannian, so we need to stratify W in or¬ 
der to get differentiable functions. For each integer s £ 
[0,n], we introduce the submanifold Wa of W consisting of 
those pairs (/, IT) such that dim/“^(IT) = s. For all s, 
II induces a continuous function IE : Ws —^ Grass(B, s). 
Pick (/, IT) £ Ws. Set V = f~^{W) and denote by w : 
F —^ B/IT the canonical projection. Similarly to what we 
have done for direct images, one can prove that the tangent 
space of Ws at some point (/, IT) £ Ws is the subspace 
of Hom(B, B) X Hom(IT, F/W) consisting of pairs {df, dw) 
such that dv = {wodf + dwo f)y factors through the linear 
mapping ip : E/V —>■ F/W induced by /. Furthermore IE 
is differentiable at (/, IT) and its differential is the linear 
mapping that takes {df, dw) as above to the unique element 
dv £ Hom(T, E/V) satisfying p o dv = dv. 

Direct images and inverse images are related by duality as 
follows: if / : B —>■ B is any linear map and IT is a subspace 
of B, then /*(IT^) = /~^(IT)^. We can thus deduce the 
differentials of DE from those of IE and vice versa. 

Sums and intersections. Let di and d 2 be two nonneg¬ 
ative integers. We consider the function S defined on the 
manifold C = Grass(B,di) x Grass(B,d 2 ) by E(Ti,T 2 ) = 
Ti -I-T 2 . As before, in order to study E, we stratify C accord¬ 
ing to the dimension of the sum: for each integer d £ [ 0 , di + 


d 2 ], we define Cd as the submanifold of C consisting of those 
pairs (Vi, V 2 ) such that dim(Vi + V 2 ) = d. We get a well- 
defined mapping Cd —>■ Grass(i?, d) whose differential can 
be computed as before. The tangent space of Cd at a given 
point (Vi, V 2 ) consists of pairs {dvi, dv 2 ) G Hom(Vi, i?/Vi) x 
Hom(V 2 , iJ/Vi) such that dvi = dv 2 (mod Vi -I- V 2 ) on the 
intersection VinV 2 , and the differential of E at (Vi, V 2 ) maps 
{dvi,dv 2 ) to dv G Yioraiy, E/V) (with V = Vi + V 2 ) defined 
by dv{vi + V 2 ) = dvi{vi) + dv 2 {v 2 ) {vi G Vi, V 2 G V 2 ). 

Using duality, we derive a similar result for the mapping 
(Vi, V2) Ui n V2 (left to the reader). 
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Figure 4: Average loss of precision in Algorithm 2 


4.3 Implementation and experiments 


standard representation of vector spaces. One com¬ 
monly represents subspaces of K" using the charts Llvi,Vjc 
(where I is a subset of {1,..., n}) introduced above. More 
concretely, a subspace V C K'^ is represented as the span of 
the rows of a matrix Gv having the following extra property: 
there exists some I C {1,..., n} such that the submatrix of 
Gv obtained by keeping only columns with index in I is the 
identity matrix. We recall that such a representation always 
exists and, when the set of indices / is fixed, at most one Gv 
satisfies the above condition. Given a family of generators 
of V, one can compute Gv and I as above by performing 
standard row reduction. Choosing the first non-vanishing 
pivot at every stage provides a canonical choice for 7, but 
in the context of inexact base fields, choosing the pivot with 
the maximal norm yields a more stable algorithm. 


The dual representation. Of course, one may alterna¬ 
tively use the charts lAv^^Vjc- Concretely, this means that 
we represent V as the left kernel of a matrix Hv having the 
following extra property: there exists some I C {1,... ,n} 
such that the submatrix of Hv obtained by deleting rows 
with index in I is the identity matrix. As above, we can 
then compute I and Hv by performing column reduction. 

Note that switching representations is cheap and stable. If 
I = {1,..., d} with d — dim V and Id is the identity matrix 
of size d, the matrix Gv has the form {Id G'v). One can 


represent V with the same 7 and the matrix Hv 



A similar formula exists for general 7. 


Operations on vector spaces. The first representation we 
gave is well suited for the computation of direct images and 
sums. For instance, to compute f{V) we apply / to each 
row of Gv, obtaining a family of generators of f{V), and 
then row reduce. Dually, the second representation works 
well for computing inverse images, including kernels, and 
intersections. Since translating between the two dual repre¬ 
sentations is straightforward, we get algorithms for solving 
both problems using either representation. 

Some experiments. Let us consider the example compu¬ 
tation given in the following algorithm. 


Algorithm 2: exainple_vector_space 

Input: two integers n and N 

1. Set Lo = ((1 -b 0(2"^), 0(2"^), 0(2"^))) C Qi 

2 . for i = 0 ,..., n — 1 

3. pick randomly a, ^,^,5 G A 73 , 3 (Z 2 ) with high precision 

4. compute Li+i = {a{Li) + fd{Li)) n {^{Li) + 5{Li)) 

5. return L„ 


The expression high precision on line 3 means that the pre¬ 
cision on a, /3, 7 and S is set in such a way that it does not 
affect the resulting precision on Li+i. Figure 4 shows the 
losses of precision when executing Algorithm 2 with various 
inputs n (the input N is always chosen sufficiently large so 
that it does not affect the behavior of precision). The Coord- 
wise column corresponds to the standard way of tracking 
precision. On the other hand, in the two last columns, 
the precision is tracked using lattices. The Diffused col¬ 
umn gives the amount of diffused precision, factored to be 
comparible to the Coord-wise column. The fact that only 
negative numbers appear in this column means that we are 
actually always gaining precision with this model! Finally, 
the Projected column gives the precision loss after projecting 
the lattice precision onto coordinates. 


APPENDIX 

A. PROOF OF PROPOSITION 2.4 

We prove Proposition 2.4 in the slightly more general con¬ 
text of Tf-Banach spaces. 

A.I Composite of locally analytic functions 

Let U, V and W be three open subsets in 77-Banach 
spaces E, F and G, respectively. We assume that 0 G 77, 
0 G U. Let f : U ^ V and g : V ^ W he two locally 
analytic functions around 0 with /(O) = 0. The composi¬ 
tion h = g o f is then locally analytic around 0 as well. Let 
/ = E„>o U 9 = En>o and h = Y.n>o ^e the ana¬ 
lytic expansions of f, g and h. Here /„, g^ and hn are the 
restrictions to the diagonal of some symmetric n-linear forms 
Fn, Gn and H„, respectively. The aim of this subsection is 
to prove the following intermediate result. 

Proposition A.I. With the above notation, we have 

l|7lr|| < sup IlSmIl • ||/ni||---||/n,„|| 

m,(ni ) 

for all nonnegative integers r, where the supremum is taken 
over all pairs (m, (n^)) where m is a nonnegative integer and 
(ni)i<i<m is a sequence of length m of nonnegative integers 
such that ni -b ... -b n„i = r. 

A computation gives the following expansion for g o f: 


... I • • • I /"!> ■ ■ ■ > , ■ ■ ■, fnt ) (4) 

where ^ ^ fc ) ‘denotes the multinomial coefficient and 
the sum runs over: 

(a) all finite sequences (fci) of positive integers whose length 
(resp. sum) is denoted by I (resp. m), and 











(b) all finite sequences (ui) of positive integers of length £. 
Moreover, in the argument of Gm, the variable fm is re¬ 
peated ki times. 

The degree of Gm{Ui, ■ ■ ■, Ui, ■ ■ ■, Ut, ■ ■ ■, Ut) is r = 
feini -I- ... -I- kem and then contributes to hr- As a con¬ 
sequence hr is equal to (4) where the sum is restricted to 
sequences (ki), (m) such that kim kini = r. Propo¬ 

sition A.l now follows from the next lemma. 

Lemma A.2. Let E be a K-vector space. Let ip : E'^ —^ 
K be a symmetric m-linear form and if : E ^ K defined 
by ip{x) = ip{x,x, ... ,x). Given positive integers k\,... ,kt 
whose sum is m and x\, ... ,xi £ E, we have 

IK fei k^--- 

where, in the LHS, the variable Xi is repeated ki times. 
Proof. It is enough to prove that 


for all nonnegative integers r, where the supremum runs 
over all pairs (m, (rii)) where m is a nonnegative integer and 
(ni)i<i<m is a sequence of length m of nonnegative integers 
such that ni + .. .+nm = r. We set Ur = ||r!/r||. Multiplying 
the above inequality by ||r!||, we obtain: 

m 

Ur+i< sup llg^ll ■ ]Tmax(M„., ||ni!/i„J|) (5) 

since the multinomial coefficient ( ^ ) is an integer 

\ni ... Urn J 

and hence has norm at most 1. We now pick two real num¬ 
bers a and b satisfying the hypothesis of the Lemma. Set d — 
A{h){a). Going back to the definitions of A{h) and Legendre 
transform, we get ||h„|| < for all n. Similarly, from 

our hypothesis on (a,b), we find ||gm|| < 
for all m. We are now ready to prove Ur < by induc¬ 

tion on r. When r = 0, it is obvious because uo vanishes. 
Otherwise, the induction follows from 


{ kik2 ■■■ ki) 


.,xi,...,xi,...,xe)\\ < IIV'II 


provided that all the Xi’s have norm at most 1. We pro¬ 
ceed by induction on i. The £ = 1 case follows directly 
from the definition of ||'i/’||- We now pick (£ + 1) integers 
fei,..., ki+i whose sum equals m, together with {£ + 1) el¬ 
ements xi,..., Xi.^.l lying in the unit ball of E. We also 
consider a new variable A varying in Ok- We set x'i = Xi, 
k'i = ki when i < £ and x'i = xi + Xxi+i and k'i = ki-\- 
By the induction hypothesis, we know that the inequality 


( 




.,x'i,...,x'i,...,x'i)\\ < ||l^|| 


holds for all \ £ K. Furthermore, the LHS of the inequality 
is a polynomial P(A) of degree k'i whose coefficient in A' is 




fci 


ki-i j 


) 


with = {xi,... ,x-i_,... ,xi+-i,... ,Xi+i) where Xi is re¬ 
peated ki times if i < £ and Xi (resp. Xi+i) is repeated j 
times (resp. k'^ — j times). Since ||P(A)|| < ||'!^|| for all A in 
the unit ball, the norm of all its coefficients must also be at 
most ||i/>||. From the coefficient of A*^^, the result follows. □ 


A.2 Bounding a growing function 

We return to the setting of Proposition 2.4. Let / = 
E„>o/>^. 9 = J2r>o9n and h = En>o ^e the analytic 
expansions of /, g and h. Here fn, Qn and hn are the re¬ 
strictions to the diagonal of some symmetric n-linear forms 
Fn, Gn and Hn, respectively. We recall that A(/) is the 
Legendre transform of the Newton polygon NP(/) defined 
in Section 3 [3, Proposition 3.9], and that q is a real number 
such that ||n!|| > for all positive integers n. 

Lemma A.3. We keep the above notation. If {a, b) satisfies 
b > a + A(g)(max(6, A(/i)(a))) then b > A(/)(a — q). 
Proof. We have /' = where 

fn'.U^ C{E, F), X [h n ■ Fn{h,x,x,... ,x)). 

Taking h = x,we find ||/4|| > ||n/„|| = ||n|| • ||/„||. Combin¬ 
ing this with Proposition A.l, we get 


m 

\\{r+l)fr+l\\< sup llffmll ■ ll^"ill) 


^ — m.ax(6,«i) + ( —an j +max( 6, d)) 

\ sup G 

m,(nj) 

_ —a — ar ^ — a(r+l) + 6 

From the definition of Ur, we obtain ||/r|| < Ur ■ ||r!||“^ < 
e-G-Ar+b^ Thus b > A(/)(a - a). □ 

We can now conclude the proof of Proposition 2.4 as fol¬ 
lows. Given a £ R and b = a-\- Ag o Ah{a), we have to prove 
that A(/)(a —a) < b provided that b < u. Thanks to Propo¬ 
sition 2.4, it is enough to check that such pairs (a, b) satisfy 
the hypothesis of Lemma A.3. Clearly: b > a+A{g)oA{h){a) 
since Ag > A{g), Ah > A{h) and Ag is nondecreasing. Fur¬ 
thermore, from b < u, we get Ag{b) — min2,gi[{ A3(a;) < Ag o 
Ah{a), from which we derive a-|-Ag(&) < a -I- Ag o Ajt(a) = b. 
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